Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality
We consider the problem of estimating the Wasserstein distance between the empirical measure and a set of probability measures whose expectations over a class of functions (hypothesis class) are constrained. If this class is sufficiently rich to characterize a particular distribution (e.g., all Lipschitz functions), then our formulation recovers the Wasserstein distance to such a distribution. We establish a strong duality result that generalizes the celebrated Kantorovich-Rubinstein duality. We also show that our formulation can be used to beat the curse of dimensionality, which is well known to affect the rates of statistical convergence of the empirical Wasserstein distance. In particular, examples of infinite-dimensional hypothesis classes are presented, informed by a complex correlation structure, for which it is shown that the empirical Wasserstein distance to such classes converges to zero at the standard parametric rate. Our formulation provides insights that help clarify why, despite the curse of dimensionality, the Wasserstein distance enjoys favorable empirical performance across a wide range of statistical applications.
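For context, the classical Kantorovich-Rubinstein duality that the paper's strong duality result generalizes can be stated as follows (standard textbook form, not quoted from this paper). The "curse of dimensionality" mentioned above refers to the classical fact that, for the empirical measure of n i.i.d. samples from a measure on R^d with d >= 3, the expected 1-Wasserstein distance to the true measure decays only at the rate n^{-1/d}, much slower than the parametric rate n^{-1/2}.

```latex
% Classical Kantorovich-Rubinstein duality: the 1-Wasserstein distance
% between two probability measures equals a supremum over 1-Lipschitz
% test functions.
W_1(\mu, \nu) \;=\; \sup_{\|f\|_{\mathrm{Lip}} \le 1}
  \left\{ \int f \, \mathrm{d}\mu \;-\; \int f \, \mathrm{d}\nu \right\}
```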
Author Feedback
We thank the reviewers for their valuable comments and suggestions, which will help us improve the paper. The "twin-delay" technique is well established as a means of estimating Ns. In addition, we introduce SIA and an asymptotic lower bound for entropy estimation; these details will be added. We will also clarify that sorting is applied to each individual vector.
Review for NeurIPS paper: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality
Summary and Contributions: ***** UPDATE ***** I realize I might have been harsh in my evaluation. I believe the paper would have been better suited to a more theory-oriented statistics conference or journal, but this is a recurrent problem at NeurIPS and I shouldn't have taken it out on the authors. While their theoretical result is really interesting, I also didn't appreciate that the authors barely mentioned previous work on statistical learning bounds with optimal transport. There have been recent efforts on this topic by several teams, and the authors should at least acknowledge them. However, if other reviewers have taken the time to thoroughly review the proof of the main result, I'm willing to increase my score.
Meta-Review for NeurIPS paper: Quantifying the Empirical Wasserstein Distance to a Set of Measures: Beating the Curse of Dimensionality
Most of the reviewers were excited about this work, and I'm pleased to recommend it for publication. In the revision, please address all changes promised in the rebuttal and/or requested in the reviews. The outlier reviewer, R1, has some valid points about the exposition, as well as discomfort with the length of the appendix (it is true that this is difficult to review in the NeurIPS environment), but these are not reasons to reject the work. That said, the authors are encouraged to take R1's expository suggestions seriously in their revision to make the work as approachable as possible.
Wasserstein Distance Guided Representation Learning for Domain Adaptation
Jian Shen, Yanru Qu, Weinan Zhang, Yong Yu
Domain adaptation aims to generalize a high-performance learner to a target domain by utilizing knowledge distilled from a source domain that has a different but related data distribution. One solution to domain adaptation is to learn domain-invariant feature representations, while the learned representations should also be discriminative in prediction. To learn such representations, domain adaptation frameworks usually include a domain-invariant representation learning approach to measure and reduce the domain discrepancy, as well as a discriminator for classification. Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain-invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted the domain critic, to estimate the empirical Wasserstein distance between the source and target samples, and optimizes the feature extractor network to minimize the estimated Wasserstein distance in an adversarial manner. The theoretical advantages of the Wasserstein distance for domain adaptation lie in its gradient property and promising generalization bound. Empirical studies on common sentiment and image classification adaptation datasets demonstrate that the proposed WDGRL outperforms state-of-the-art domain-invariant representation learning approaches.
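To make the adversarial scheme concrete, below is a minimal PyTorch sketch of the WDGRL idea described in the abstract: a domain critic estimates the empirical Wasserstein distance between source and target features, and the feature extractor is trained to shrink that estimate. This is not the authors' code; the architectures, learning rates, penalty weight, and synthetic data are illustrative assumptions, and the source classification loss used in the full method is omitted for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative feature extractor and domain critic (sizes are assumptions).
feature_extractor = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 32))
critic = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))

opt_f = torch.optim.Adam(feature_extractor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-4)

def gradient_penalty(critic, h_s, h_t):
    """WGAN-GP-style penalty keeping the critic approximately 1-Lipschitz."""
    alpha = torch.rand(h_s.size(0), 1)
    interp = (alpha * h_s + (1 - alpha) * h_t).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Synthetic stand-ins for source and target minibatches.
x_src = torch.randn(128, 20)
x_tgt = torch.randn(128, 20) + 0.5

for step in range(200):
    # Critic step: maximize the empirical W1 estimate
    # E[f(h_s)] - E[f(h_t)] over (approximately) 1-Lipschitz critics f.
    h_s = feature_extractor(x_src).detach()
    h_t = feature_extractor(x_tgt).detach()
    w1_est = critic(h_s).mean() - critic(h_t).mean()
    loss_c = -w1_est + 10.0 * gradient_penalty(critic, h_s, h_t)
    opt_c.zero_grad()
    loss_c.backward()
    opt_c.step()

    # Extractor step: move source and target features closer by
    # minimizing the critic's distance estimate (the adversarial part).
    h_s = feature_extractor(x_src)
    h_t = feature_extractor(x_tgt)
    loss_f = critic(h_s).mean() - critic(h_t).mean()
    opt_f.zero_grad()
    loss_f.backward()
    opt_f.step()
```

The alternating updates mirror the WGAN recipe: the critic's value E[f(h_s)] - E[f(h_t)] is exactly the Kantorovich-Rubinstein dual objective restricted to the critic's function class, which is what ties this learning procedure back to the duality discussed in the main paper.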